fix: e2e-triage and e2e-fix workflows by alishakawaguchi · Pull Request #757 · entireio/cli

alishakawaguchi · 2026-03-23T19:22:08Z

Summary

- claude-code-action@v1 does not install project plugins, so the /e2e:triage-ci and /e2e:implement slash commands were silently left unresolved.
- The triage step completed in 21ms with $0 API cost: the model was never called, so triage.md came out empty.
- Replaced the slash commands with explicit "Read and follow .claude/skills/e2e/..." instructions that use the Read tool (already in allowedTools) to load the skill files directly.

Test plan

- Re-trigger the E2E Triage workflow against the same failing run to confirm Claude produces non-empty triage output
- Verify triage artifact contains actual findings (not empty)
- Verify plan artifact contains an actual implementation plan

🤖 Generated with Claude Code

Note

Low Risk
Low-risk, workflow-only change. It does affect the automated triage/plan generation prompts, so CI triage output could change or break if the referenced skill files change or the instructions are misinterpreted.

Overview
The E2E triage workflow now removes /e2e:triage-ci and /e2e:implement slash-command prompts and instead tells claude-code-action@v1 to Read and follow the procedures in .claude/skills/e2e/triage-ci.md and .claude/skills/e2e/implement.md.

It also makes the prompts explicitly pass the artifact path or run URL, agent, and SHA, and clarifies that CI artifact analysis should skip local re-run steps when a local artifact path is provided.

Written by Cursor Bugbot for commit 2ba0a47.

…iage workflow claude-code-action@v1 does not install project plugins, so /e2e:triage-ci and /e2e:implement slash commands were not resolved. The triage step completed in 21ms with $0 API cost — the model was never called, producing empty output. Replace slash commands with explicit "Read and follow" instructions that use the Read tool (already in allowedTools) to load skill files directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 77de372b0fce

Copilot

Pull request overview

Updates the E2E triage GitHub Actions workflow prompts to avoid relying on unresolved Claude Code project slash commands, by instructing the model to Read and follow the repo’s .claude/skills/e2e/* procedures directly.

Changes:

- Replaces /e2e:triage-ci ... usage with explicit instructions to read .claude/skills/e2e/triage-ci.md, including passing inputs (artifact path / run URL, agent, SHA).
- Replaces /e2e:implement ... usage with instructions to read .claude/skills/e2e/implement.md and then read the generated triage.md.

Copilot · 2026-03-23T19:25:45Z

.github/workflows/e2e-triage.yml

          prompt: |
-            /e2e:triage-ci ${{ env.RUN_URL }} --agent ${{ matrix.agent }} --sha ${{ needs.matrix-setup.outputs.sha }}
+            Read and follow the full E2E triage procedure from .claude/skills/e2e/triage-ci.md.
+
+            Inputs:
+            - CI run URL: ${{ env.RUN_URL }}
+            - Agent: ${{ matrix.agent }}
+            - SHA: ${{ needs.matrix-setup.outputs.sha }}


In rerun mode (Run triage (with re-runs)), the prompt provides a CI run URL as an input. The referenced procedure (.claude/skills/e2e/triage-ci.md, Step L1) explicitly says that when a CI run reference is provided, it should download artifacts and skip Steps L2–L5 (local re-runs). That means this rerun job path is likely to never execute the intended local re-runs, despite the extra setup/timeouts. Consider updating the prompt (or the skill) so rerun mode still performs Steps L2–L5 after identifying failing tests (e.g., use the run URL only to discover failures, then run local re-runs for verification).

Copilot · 2026-03-23T19:25:46Z

.github/workflows/e2e-triage.yml

+            Read and follow the fix implementation procedure from .claude/skills/e2e/implement.md.
+
+            Read the triage findings at ${{ github.workspace }}/e2e-triage-artifacts/${{ matrix.agent }}/triage.md for agent ${{ matrix.agent }}.


The plan step prompt points to .claude/skills/e2e/implement.md, which mandates entering plan mode via /plan and later requires running real E2E tests before summary. In this workflow step you're only trying to generate a plan artifact; these instructions can push the agent to attempt expensive/long-running E2E executions or rely on /plan being recognized. Consider adding an explicit constraint in the prompt to only produce the implementation plan (use the EnterPlanMode tool if needed) and stop before applying changes or running tests.

Suggested change

Read and follow the fix implementation procedure from .claude/skills/e2e/implement.md.

Read the triage findings at ${{ github.workspace }}/e2e-triage-artifacts/${{ matrix.agent }}/triage.md for agent ${{ matrix.agent }}.

You are running in a planning-only workflow step. In this step you MUST NOT run or trigger any real E2E tests, shell commands, or code changes, and you MUST ONLY produce a written implementation plan.

1. Read and use only the planning/implementation design guidance from .claude/skills/e2e/implement.md. If that document instructs you to enter plan mode via `/plan` or to run tests before summarizing, treat those as requirements for a future implementation step, not for this step. Do not actually run tests, apply changes, or rely on `/plan` being recognized here.

2. Read the triage findings at ${{ github.workspace }}/e2e-triage-artifacts/${{ matrix.agent }}/triage.md for agent ${{ matrix.agent }}.

3. Based on these inputs, write a concise, step-by-step implementation plan describing how a human or a later workflow should fix the identified issues.

Your output must be plain text only and must NOT include `/plan` commands, tool invocations, or instructions that this very agent run should execute tests or modify code. Stop after writing the plan.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-03-23T19:27:04Z

.github/workflows/e2e-triage.yml

+            Inputs:
+            - CI run URL: ${{ env.RUN_URL }}
+            - Agent: ${{ matrix.agent }}
+            - SHA: ${{ needs.matrix-setup.outputs.sha }}


Re-run path missing override to prevent skipping local tests

Medium Severity

The re-run triage prompt provides a CI run URL input, but triage-ci.md explicitly instructs the model to "skip Steps L2-L5 (local re-runs)" when a CI run reference is given. The analysis-only path correctly includes a redundant skip instruction, but the re-run path omits the opposite override — telling the model to actually execute L2-L5. This means the model will likely skip local re-runs, making the entire re-run path (agent CLI installation, bootstrap, tmux, Bash tool access) useless.

claude-code-action's OIDC token exchange requires the workflow file to match the default branch, preventing testing on feature branches. Pass github_token directly to bypass this restriction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: d73c731f6abc

Merge the two claude-code-action invocations (triage + plan) into a single prompt that runs in plan mode. Claude writes both triage findings and fix plan to the plan file, which is then extracted and split into triage.md and plan.md artifacts.

Benefits:
- Single invocation reduces cost (~$0.60 vs $0.86)
- Plan mode gives structured reasoning for fix plans
- Plan content captured from file (fixes empty plan artifact)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 9a30a28f697c

- Set display_report: false to remove tool call noise from summary tab
- Add custom job summary step that shows triage + plan markdown
- Remove redundant "Upload plan artifact" (triage upload has both files)
- Copy execution.json to artifacts for debugging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 609909ae979f

…tion, log Slack errors

- Collapse per-agent matrix into single triage job so Claude can correlate failures across agents and find shared root causes
- Add "triage starting" Slack notification in setup job before triage begins
- Log actual Slack API error field in post-slack-message.sh (was silently swallowing the error, making failures impossible to diagnose)
- Update e2e-fix workflow to download single unified triage artifact
- URL-encode failed_agents in Fix It URL since it may contain commas

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 74562887fe7f
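The URL-encoding of failed_agents mentioned in this commit can be illustrated with a minimal Python sketch. The base URL, the query parameter names, and the agent names below are assumptions made up for illustration; the PR does not show the real Fix It URL, and the workflow may well build it in shell instead.

```python
from typing import List
from urllib.parse import urlencode


def build_fix_url(base_url: str, triage_run_id: str, failed_agents: List[str]) -> str:
    """Build a link whose query string survives commas in the agent list.

    base_url and the parameter names are hypothetical, not taken from the PR.
    """
    query = urlencode(
        {
            "triage_run_id": triage_run_id,
            # Agents are joined with commas; urlencode percent-encodes them as %2C.
            "failed_agents": ",".join(failed_agents),
        }
    )
    return f"{base_url}?{query}"


# Hypothetical values, for illustration only.
fix_url = build_fix_url("https://example.invalid/fix", "123", ["agent-a", "agent-b"])
```

Encoding the whole value, rather than escaping commas by hand, also covers any other reserved character that might later appear in an agent name.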

…te it The "Post triage starting" step was silently skipped because env.SLACK_BOT_TOKEN was only set at the step level, but GitHub Actions evaluates if: conditions before step env is applied. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 585bce40d0d2

Slack thread_ts must be in dot-decimal format (e.g., "1482960137.003543"). The dot gets stripped somewhere in the dispatch pipeline (URL encoding or GitHub Actions numeric coercion), causing Slack to reject with invalid_thread_ts. Re-insert the dot assuming 6 decimal places. Ref: https://api.slack.com/messaging/retrieving Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0009b566bbbc
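The dot re-insertion described in this commit can be sketched as follows. This is an illustrative Python version under the commit's own assumption of exactly six decimal places; the function name is hypothetical and the workflow's real implementation is not shown in the PR.

```python
def normalize_thread_ts(raw: str) -> str:
    """Restore the dot in a Slack thread_ts that lost it in transit.

    Assumes the fractional part is always six digits, as the commit does.
    """
    ts = raw.strip()
    if "." in ts:
        return ts  # already in dot-decimal form, leave untouched
    if not ts.isdigit() or len(ts) <= 6:
        raise ValueError(f"cannot normalize thread_ts: {raw!r}")
    # Split before the last six digits: seconds, then microseconds.
    return f"{ts[:-6]}.{ts[-6:]}"
```

Note that this only repairs a stripped dot. If numeric coercion also altered the digits themselves (for example by rounding), the original value cannot be recovered this way.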

- Simplify "triage started" message: remove agent list, link to triage run
- Merge "triage completion" and "fix plan" steps into single "triage result"
- Use Block Kit with green "Fix It" button instead of plain text link
- Remove raw plan/markdown dump from Slack thread
- Add --payload flag to post-slack-message.sh for Block Kit support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: be8091cb744b

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6ce3944ef17d

- Add thread_ts dot normalization (same fix as triage workflow)
- Simplify "fix started" message with emoji, link to fix run
- Success: broadcast ":review: E2E fix applied: <PR_URL>" to channel using reply_broadcast + unfurl_links
- Failure: ":x: E2E fix failed" with run link (thread only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: dc9d1a373abd

After applying fixes, the workflow now installs agent CLIs and runs the actual failing E2E tests twice per agent to confirm the fix works. If verification fails, Claude Code gets one more attempt with the failure output as context, then tests run again. PR is only created after E2E verification passes, and Slack messages now report verification status (attempt count, pass/fail). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8eae20f22eab
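The verify-then-retry flow in this commit can be summarized as a small control loop. The sketch below is a Python illustration only: `run_tests` and `retry_fix` are hypothetical callables standing in for the E2E test run and the second Claude Code attempt, and the real workflow presumably expresses this as separate workflow steps.

```python
from typing import Callable, Tuple


def verify_fix(
    run_tests: Callable[[], Tuple[bool, str]],
    retry_fix: Callable[[str], None],
    runs_per_attempt: int = 2,
    max_attempts: int = 2,
) -> Tuple[bool, int]:
    """Return (verified, attempts_used).

    Each attempt runs the tests runs_per_attempt times and passes only if
    every run passes. After a failed attempt, retry_fix receives the failure
    output once, then the tests run again.
    """
    for attempt in range(1, max_attempts + 1):
        failure = None
        for _ in range(runs_per_attempt):
            passed, output = run_tests()
            if not passed:
                failure = output
                break
        if failure is None:
            return True, attempt  # safe to open the PR
        if attempt < max_attempts:
            retry_fix(failure)  # one more fix attempt with failure context
    return False, max_attempts
```

The returned attempt count corresponds to the verification status (attempt count, pass/fail) that the commit says the Slack messages now report.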

The fix workflow now only requires triage_run_id — run_url and failed_agents are auto-detected from metadata.json in the triage artifacts. Explicit inputs still take precedence for backward compatibility with the Slack "Fix It" button. The triage workflow now writes metadata.json (run_url, failed_agents, sha) alongside its existing plan/triage artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ce6d58f81698
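The precedence rule described here (explicit inputs first, metadata.json as fallback) might look like the following Python sketch. The key names run_url, failed_agents, and sha come from the commit message; the function name, the empty-string convention for "not provided", and storing failed_agents as a comma-separated string are assumptions.

```python
import json
from pathlib import Path
from typing import Dict


def resolve_fix_inputs(artifact_dir: str, run_url: str = "", failed_agents: str = "") -> Dict[str, str]:
    """Fill in missing workflow inputs from metadata.json in the triage artifacts."""
    metadata_path = Path(artifact_dir) / "metadata.json"
    metadata = json.loads(metadata_path.read_text()) if metadata_path.exists() else {}
    return {
        # Explicit inputs win, so older callers that pass them keep working.
        "run_url": run_url or metadata.get("run_url", ""),
        "failed_agents": failed_agents or metadata.get("failed_agents", ""),
        # sha is never passed explicitly in this sketch; it only comes from metadata.
        "sha": metadata.get("sha", ""),
    }
```

Keeping the explicit inputs optional rather than removing them is what preserves backward compatibility with the Slack "Fix It" button.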

gh run download expects a numeric run ID but the Cloudflare Worker or manual trigger may pass a full GitHub Actions URL. Extract the numeric ID from the URL before calling gh run download. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0c1ad5d2c425
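Extracting the numeric ID before calling gh run download can be sketched like this. It is a Python illustration rather than the workflow's actual code; the function name is hypothetical, and it relies only on the standard `/actions/runs/<id>` segment of GitHub Actions run URLs.

```python
import re

# Matches the run ID segment of a GitHub Actions run URL.
_RUN_ID_IN_URL = re.compile(r"/actions/runs/(\d+)")


def extract_run_id(value: str) -> str:
    """Accept either a bare numeric run ID or a full run URL; return the ID."""
    value = value.strip()
    if value.isdigit():
        return value  # already a plain run ID
    match = _RUN_ID_IN_URL.search(value)
    if match is None:
        raise ValueError(f"no run ID found in {value!r}")
    return match.group(1)
```

Because the pattern stops at the first non-digit, trailing segments such as `/job/...` or `/attempts/...` do not interfere.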

The action validates that the workflow file matches the default branch. Passing github_token allows it to authenticate and run on feature branches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 887ffbfe48ff

Copilot AI review requested due to automatic review settings March 23, 2026 19:22

alishakawaguchi self-assigned this Mar 23, 2026

Copilot started reviewing on behalf of alishakawaguchi March 23, 2026 19:22

Copilot AI reviewed Mar 23, 2026


cursor bot reviewed Mar 23, 2026


alishakawaguchi and others added 13 commits March 23, 2026 12:34

fix: add context line explaining Fix It button creates a draft PR

5a2f90e

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6ce3944ef17d

alishakawaguchi changed the title ~~fix: replace slash commands with explicit Read instructions in e2e-triage workflow~~ fix: e2e-triage and e2e-fix workflows Mar 24, 2026

alishakawaguchi wants to merge 14 commits into main from e2e-triage-fix


